AITopics | concentration result


Concentration of General Stochastic Approximation Under Heavy-Tailed Markovian Noise

Agrawal, Shubhada, Maguluri, Siva Theja, Zubeldia, Martin

arXiv.org Machine Learning | May-21-2026

We establish maximal concentration bounds for the iterates generated by stochastic approximation algorithms with general step sizes, where the noise has a finite-state Markovian component plus a martingale-difference component. When the martingale-difference noise is bounded, we show that the tail of the error can be sub-Gaussian, sub-Weibull, or lighter than any Pareto but heavier than any Weibull, depending on the step size sequence and on whether the random operator is almost surely contractive, almost surely non-expansive, or expansive with positive probability. Our analysis relies on a novel Lyapunov function involving the moment-generating function of the solution to a Poisson equation, together with an auxiliary projected algorithm. We complement the upper bounds with worst-case examples showing that qualitatively sharper bounds are impossible. We further study the case of unbounded martingale-difference noise when the average operator is contractive and the step sizes are of order $1/k$. In this setting, we show that if the random operator is almost surely non-expansive, then the error tail is at most three times heavier than the noise tail, whereas if the random operator is expansive with positive probability, then the error may have substantially heavier tails. These results are obtained through a novel black-box truncation argument that reduces the unbounded-noise setting to the bounded-noise case.
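As a rough illustration of the setting in this abstract, the sketch below runs a scalar stochastic approximation recursion with a two-state Markov chain and bounded martingale-difference noise, then estimates an empirical error tail. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear operator `a * x + b[y]`, the transition probabilities, the step size `2 / (k + 2)`, and the function names `run_sa`, `fixed_point`, and `empirical_tail`.

```python
import random


def stationary_two_state(p01, p10):
    """Stationary distribution of a two-state chain that switches
    0 -> 1 with probability p01 and 1 -> 0 with probability p10."""
    total = p01 + p10
    return (p10 / total, p01 / total)


def fixed_point(a=0.5, b=(1.0, 4.0), p01=0.1, p10=0.2):
    """Fixed point of the averaged operator x -> a * x + E_pi[b(Y)]."""
    pi = stationary_two_state(p01, p10)
    mean_b = pi[0] * b[0] + pi[1] * b[1]
    return mean_b / (1.0 - a)


def run_sa(num_steps, seed, a=0.5, b=(1.0, 4.0), p01=0.1, p10=0.2, x0=0.0):
    """Run x_{k+1} = x_k + alpha_k * (a * x_k + b[Y_k] + M_k - x_k).

    Y_k is the Markovian noise, M_k is i.i.d. uniform on [-1, 1]
    (a bounded martingale-difference term), and alpha_k = 2 / (k + 2)
    is a step size of order 1/k. Since |a| < 1, this illustrative
    operator is contractive for every noise realization.
    """
    rng = random.Random(seed)
    x, y = x0, 0
    for k in range(num_steps):
        alpha = 2.0 / (k + 2)
        m = rng.uniform(-1.0, 1.0)
        x += alpha * (a * x + b[y] + m - x)
        # Advance the two-state Markov chain.
        switch_prob = p01 if y == 0 else p10
        if rng.random() < switch_prob:
            y = 1 - y
    return x


def empirical_tail(threshold, num_runs=200, num_steps=2000):
    """Fraction of independent runs whose final error exceeds threshold."""
    x_star = fixed_point()
    errors = [abs(run_sa(num_steps, seed) - x_star) for seed in range(num_runs)]
    return sum(e > threshold for e in errors) / num_runs


if __name__ == "__main__":
    for eps in (0.1, 0.25, 0.5, 1.0):
        print(eps, empirical_tail(eps))
```

Plotting the printed tail frequencies against the threshold gives a crude picture of how fast the error tail decays in this toy case. It says nothing about the heavier-tailed regimes described in the abstract, which would require a random operator that is expansive with positive probability.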

artificial intelligence, machine learning, reinforcement learning, (18 more...)


2605.20999

Country: Europe (0.27)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


bf0857cb9a41c73639f028a80301cdf0-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 21:19:23 GMT

accuracy, artificial intelligence, test accuracy, (17 more...)


Technology: Information Technology > Artificial Intelligence (0.96)


358e4a39b8ace4744fbad77e84a7e757-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 02:20:35 GMT

We call that σ-algebra history before n. Identification tasks. We focus on best arm identification (BAI).

artificial intelligence, data mining, machine learning, (19 more...)


Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.46)


XXXXX

XXX

Neural Information Processing Systems | Feb-7-2026, 14:34:09 GMT

There have been multiple recent approaches to obtain a near-optimal policy in CMDPs in the regret-minimization or PAC-RL settings [13, 38, 9, 19, 31, 22, 36, 12, 15, 16, 11].

algorithm, artificial intelligence, machine learning, (17 more...)


Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)


047bf3f8aa5a050351de38df589cc6af-Paper-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 07:07:19 GMT

algorithm, bandit, experiment, (17 more...)


Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.68)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)


Near-optimal Rank Adaptive Inference of High Dimensional Matrices

Zheng, Frédéric, Jedra, Yassir, Proutiere, Alexandre

arXiv.org Machine Learning | Oct-10-2025

We address the problem of estimating a high-dimensional matrix from linear measurements, with a focus on designing optimal rank-adaptive algorithms. These algorithms infer the matrix by estimating its singular values and the corresponding singular vectors up to an effective rank, adaptively determined based on the data. We establish instance-specific lower bounds for the sample complexity of such algorithms, uncovering fundamental trade-offs in selecting the effective rank: balancing the precision of estimating a subset of singular values against the approximation cost incurred for the remaining ones. Our analysis identifies how the optimal effective rank depends on the matrix being estimated, the sample size, and the noise level. We propose an algorithm that combines a Least-Squares estimator with a universal singular value thresholding procedure. We provide finite-sample error bounds for this algorithm and demonstrate that its performance nearly matches the derived fundamental limits. Our results rely on an enhanced analysis of matrix denoising methods based on singular value thresholding. We validate our findings with applications to multivariate regression and linear dynamical system identification.

algorithm, matrix, probability, (15 more...)


2510.08117

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)


A Some concentration results for uniform random variables

Neural Information Processing Systems | Oct-9-2025, 06:21:45 GMT

In this section, we present the proof of Theorem 3.3. In Section B.1, we provide the detail of the Section B.4 and completes the proof. [...] B.2 Upper bound on T L). (26) [...] The KKT condition suggests that the primal-dual optimal pair ( θ [...] This section includes additional experiment results on applying ResMem to CIFAR100 dataset. In addition to the results already presented in Section 4.2, we also evaluate ResMem performance for [...] Figure 4: Test (left)/Training (right) accuracy for different sample sizes. [...] This section includes additional experiment results on applying ResMem to ImageNet dataset.

accuracy, artificial intelligence, test accuracy, (18 more...)


Technology: Information Technology > Artificial Intelligence (0.96)


358e4a39b8ace4744fbad77e84a7e757-Paper-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 10:45:38 GMT

algorithm, challenger, empirical performance, (15 more...)


Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.46)


Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders

Chen, Siyu, Sheen, Heejune, Xiong, Xuyuan, Wang, Tianhao, Yang, Zhuoran

arXiv.org Machine Learning | Jun-18-2025

We study the challenge of achieving theoretically grounded feature recovery using Sparse Autoencoders (SAEs) for the interpretation of Large Language Models. Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations such as hyperparameter sensitivity and instability. To address these issues, we first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability by modeling polysemantic features as sparse mixtures of underlying monosemantic concepts. Building on this framework, we introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity. We theoretically prove that this algorithm correctly recovers all monosemantic features when input data is sampled from our proposed statistical model. Furthermore, we develop an improved empirical variant, Group Bias Adaptation (GBA), and demonstrate its superior performance against benchmark methods when applied to LLMs with up to 1.5 billion parameters. This work represents a foundational step in demystifying SAE training by providing the first SAE algorithm with theoretical recovery guarantees, thereby advancing the development of more transparent and trustworthy AI systems through enhanced mechanistic interpretability.

artificial intelligence, machine learning, natural language, (19 more...)


2506.14002

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Workflow (0.92)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


An invariance principle based concentration result for large-scale stochastic pairwise interaction network systems

Como, Giacomo, Fagnani, Fabio, Zampieri, Sandro

arXiv.org Artificial Intelligence | Oct-30-2024

We study stochastic pairwise interaction network systems whereby a finite population of agents, identified with the nodes of a graph, update their states in response to both individual mutations and pairwise interactions with their neighbors. The considered class of systems includes the main epidemic models (such as the SIS, SIR, and SIRS models), certain social dynamics models (such as the voter and anti-voter models), as well as evolutionary dynamics on graphs. Since these stochastic systems fall into the class of finite-state Markov chains, they always admit stationary distributions. We analyze the asymptotic behavior of these stationary distributions in the limit as the population size grows large while the interaction network maintains certain mixing properties. Our approach relies on the use of Lyapunov-type functions to obtain concentration results on these stationary distributions. Notably, our results are not limited to fully mixed population models, as they apply to a much broader spectrum of interaction network structures, including, e.g., Erdős-Rényi random graphs.
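To make the kind of system in this abstract concrete, the sketch below simulates a voter model with mutation on a graph: at each update a random agent either mutates to a uniformly random state or copies the state of a random neighbor. This is a simplified discrete-time toy, assumed here for illustration only. The update rule, the mutation probability, the graph builders, and the name `simulate_noisy_voter` are not taken from the paper.

```python
import random


def ring_graph(n):
    """Adjacency lists for a cycle on n nodes."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def complete_graph(n):
    """Adjacency lists for the fully mixed (complete) graph on n nodes."""
    return [[j for j in range(n) if j != i] for i in range(n)]


def simulate_noisy_voter(neighbors, num_updates, mutation_prob, seed,
                         init_state=None):
    """Simulate binary agent states and return the fraction of agents
    in state 1 after each update.

    Each update picks one agent uniformly at random. With probability
    mutation_prob the agent mutates to a uniformly random state
    (individual mutation); otherwise it copies the state of a
    uniformly chosen neighbor (pairwise interaction).
    """
    rng = random.Random(seed)
    n = len(neighbors)
    if init_state is not None:
        state = list(init_state)
    else:
        state = [rng.randrange(2) for _ in range(n)]
    ones = sum(state)
    fractions = []
    for _ in range(num_updates):
        i = rng.randrange(n)
        if rng.random() < mutation_prob:
            new_value = rng.randrange(2)
        else:
            new_value = state[rng.choice(neighbors[i])]
        ones += new_value - state[i]
        state[i] = new_value
        fractions.append(ones / n)
    return fractions


if __name__ == "__main__":
    for n in (20, 200):
        trace = simulate_noisy_voter(complete_graph(n), 200 * n, 0.2, seed=0)
        tail = trace[len(trace) // 2:]
        print(n, min(tail), max(tail))
```

Comparing the spread of the second half of the trace for a small and a large population gives an informal view of how the fraction of agents in state 1 fluctuates as the population grows. It is an empirical check on one toy model, not a statement of the paper's results.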

lyapunov function, pin model, stationary distribution, (17 more...)


2410.2282

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Communications > Networks (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
